Indraneel's blog

Feeble attempts at grokking the incomprehensible.

Configuration system

Any reasonably sized service has a large number of variables whose values are not known at the time of development. There are various reasons for this, some of which are listed below:

  • The code is deployed to different environments and each environment has a different value
  • The value needs to be configured based on performance numbers
  • There is separation of responsibilities and the developer expects inputs whose value is only known to operations

Regardless of the reason, the code needs to look up runtime values, which may be stored in config files, environment variables, databases etc. Often the same value is stored in numerous places with slight or no modifications. It is customary for the deployment code to set or update the values wherever needed.

Setting configuration values

The team will often agree on a configuration system that involves a configuration template, which is instantiated once per environment into an environment specific configuration repository. At the time of deployment, the setup code will read the configuration repository and replace the variable values.

Configuration template

This is the notional/logical representation of the values needed in the environment. The template defines the structure of a fully realized environment configuration.
Some folks will keep this piece implicit but even they can tell you that for a configuration set to be valid, they expect some entities to exist.
Some teams will keep a simple flat structure in their configuration system, consisting of name-value pairs describing the entire environment. Others will create a hierarchy of sorts or enforce a structure/schema in their configuration system.
Teams that make the template explicit will often create an XML or JSON file to represent it. Inspecting the file will show some recurring kinds of entries, e.g.

  • The machines, or the endpoints for their preferred PaaS vendor
  • Run time properties specific to the service
  • Certificate details
  • Secrets

The last two are likely antipatterns: unless security has been carefully considered, people should avoid putting secrets or certificate information in their configuration system. On this note, take a look at the section on the template substitution tool and its plugins.
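
To make the difference between a flat and a hierarchical template concrete, here is a minimal sketch in Python. The property names and the "type as a string" convention are invented for illustration and are not taken from any particular team's system.

```python
import json

# Hypothetical template: each leaf names a property the environment must
# define, together with the type of value expected there.
TEMPLATE = {
    "machines": {"web": "list", "db": "list"},
    "service": {"max_connections": "int", "request_timeout_seconds": "int"},
}

def flatten(node, prefix=""):
    """Turn the hierarchy into flat dotted names, e.g. 'service.max_connections'."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

# The same template viewed as simple name-value pairs.
print(json.dumps(flatten(TEMPLATE), indent=2))
```

A team that prefers the flat form can store the dotted names directly; a team that prefers the hierarchy can keep the nested form and flatten it only when needed.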

Configuration repository

This represents the instantiated version of the template, specific to an actual environment that the code is to be deployed to. It will have real machine names, real run time values etc. It is recommended that this file be version controlled in an environment specific folder, because its contents are always in flux. The file gets changed in numerous ways:

  • The developers are going to be constantly adding new properties. They may know their correct environment values in advance, or they may not
  • The operations/devops folks will be constantly filling in the values that the developers did not know
  • The developers will periodically be requesting new machines/PaaS endpoints by adding placeholders in this file
  • The operations/devops folks will be replacing those placeholders with the correct machine values after allocation is complete

Configuration parser library

This is an optional but highly recommended piece. Rather than everyone finding their own way to parse the configuration repository, a team wide library can be used in multiple ways. It can confirm that the configuration repository and the configuration template are in alignment and check that values are in legitimate ranges. It can provide helper functions to locate properties and their values and to navigate hierarchies if needed. It can also catch application specific gotchas in the configuration before a violation makes it to the environment.
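
The alignment and range checks could look roughly like the sketch below. It assumes a flat template that maps each property name to an expected type and an optional allowed range; that layout, and every name in it, is hypothetical.

```python
# Hypothetical template: property name -> (expected type, optional (min, max) range).
TEMPLATE = {
    "max_connections": (int, (1, 10_000)),
    "request_timeout_seconds": (int, (1, 300)),
    "web_machine": (str, None),
}

def validate(template, repository):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # Properties the template demands but the repository does not supply.
    for name in sorted(template.keys() - repository.keys()):
        problems.append(f"missing property: {name}")
    # Properties the repository supplies but the template does not know about.
    for name in sorted(repository.keys() - template.keys()):
        problems.append(f"unknown property: {name}")
    # Type and range checks for everything present in both.
    for name in sorted(template.keys() & repository.keys()):
        expected_type, value_range = template[name]
        value = repository[name]
        if not isinstance(value, expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
        elif value_range and not value_range[0] <= value <= value_range[1]:
            problems.append(f"{name}: {value} outside {value_range}")
    return problems

# An illustrative repository with one missing, one unknown and one out-of-range value.
staging = {"max_connections": 50_000, "web_machine": "stg-web-01", "debug": True}
for problem in validate(TEMPLATE, staging):
    print(problem)
```

Running a check like this in the build or as a pre-deployment step is one way to stop a bad value before it reaches the environment.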

When looking for a value in the configuration system, the context very often matters. In a small environment, every property has a single value. In a large environment, a property can have numerous possible values, and the correct one depends on the scale unit/machine you are on when you look it up. The library should make that context easy to supply.
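
One way to picture a context-aware lookup is a chain of layers searched from most specific to least specific. The repository layout below (defaults, per scale unit overrides, per machine overrides) is an assumption made for the sketch, as are all the names and values.

```python
# Hypothetical repository layout: environment-wide defaults plus overrides
# keyed by scale unit and by machine.
REPOSITORY = {
    "defaults": {"cache_size_mb": 256, "db_endpoint": "db.prod.example"},
    "scale_units": {"su2": {"db_endpoint": "db-su2.prod.example"}},
    "machines": {"web-07": {"cache_size_mb": 1024}},
}

def lookup(repository, name, scale_unit=None, machine=None):
    """Resolve a property, most specific context first: machine, scale unit, default."""
    layers = [
        repository["machines"].get(machine, {}),
        repository["scale_units"].get(scale_unit, {}),
        repository["defaults"],
    ]
    for layer in layers:
        if name in layer:
            return layer[name]
    raise KeyError(name)

# The same property resolves differently depending on where the caller runs.
print(lookup(REPOSITORY, "db_endpoint"))
print(lookup(REPOSITORY, "db_endpoint", scale_unit="su2"))
print(lookup(REPOSITORY, "cache_size_mb", scale_unit="su2", machine="web-07"))
```

Callers in a small environment can ignore the optional arguments, while callers in a large one pass whatever context they have.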

Saving the configuration values in the environment

There are different options when it comes to how the values are persisted into the environment. Your service may expect to find values in different targets, e.g. environment runtime, the database, registry etc. For those cases, you should build a target specific tool that queries the configuration parser library and saves the values. Make every team member use that generic tool.
It is, however, very common for teams to store values exclusively in configuration files (especially in projects that believe in zero install deployments). In such cases it makes sense to write the configuration files as templates, and to have deployment time template substitution code fill them in (more details below).
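
For the targets other than files, a target specific tool can be quite small. The sketch below uses a database as the target, with an in-memory SQLite database standing in for the real one; the table name and columns are invented for illustration, and the values would normally come from the configuration parser library instead of a literal dictionary.

```python
import sqlite3

def save_to_database(connection, values):
    """Upsert every resolved configuration value into a simple name/value table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS config (name TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO config (name, value) VALUES (?, ?)",
        [(name, str(value)) for name, value in values.items()],
    )
    connection.commit()

connection = sqlite3.connect(":memory:")
save_to_database(connection, {"max_connections": 200, "web_machine": "stg-web-01"})
# A later deployment with a changed value updates the row in place.
save_to_database(connection, {"max_connections": 500})
rows = dict(connection.execute("SELECT name, value FROM config"))
print(rows)
```
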

Template substitution code for configuration

Such a system involves having the following pieces

  1. Developers check in templatized versions of their config files. The team must agree on a single format for configuration files (e.g. XML or JSON)
  2. The team must also agree on a DSL to indicate variables needing replacement. If the configuration repository has hierarchies, the DSL should account for them
  3. The setup infrastructure should locate the templatized config files and run them through a dedicated template substitution tool, which can use the parsing library to create instantiated config files
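
The three pieces above can be sketched in a few lines of Python. The `${a.b.c}` placeholder syntax is just one possible DSL, chosen here because a dotted path handles hierarchies naturally; the repository contents are made up.

```python
import re

# Assumed DSL: ${a.b.c} names a value by its dotted path in the repository.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")

def resolve(repository, dotted_path):
    """Walk the repository hierarchy one path segment at a time."""
    node = repository
    for segment in dotted_path.split("."):
        node = node[segment]  # a KeyError means the template names an unknown value
    return node

def substitute(template_text, repository):
    """Replace every placeholder with its value from the repository."""
    return PLACEHOLDER.sub(
        lambda match: str(resolve(repository, match.group(1))), template_text
    )

repository = {"service": {"max_connections": 200}, "machines": {"db": "stg-db-01"}}
template_text = '{"db_host": "${machines.db}", "pool_size": ${service.max_connections}}'
print(substitute(template_text, repository))
# prints {"db_host": "stg-db-01", "pool_size": 200}
```

Failing loudly on an unknown path is a deliberate choice in this sketch: a deployment that stops is usually preferable to a config file that still contains a literal placeholder.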

Adding plugins to the template substitution tool

While the configuration repository is the most common source of values for the template substitution tool, there are a few other libraries that can plug into the tool, e.g.

  • Certificate lookup libraries can embed cert thumbprints in your config files
  • Passwords can be pulled from your secret store, encrypted and saved in your config files
  • Your PaaS provider may have libraries to discover private endpoints etc. which can then be saved in your config files

Your DSL needs to be flexible enough for developers to indicate the source of the value, and your template tooling can then handle the value substitution by calling the appropriate library. This way, you can leave the security concerns to people who are actually qualified to handle them.
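
A sketch of how the DSL might name the source of a value is shown below. The `${source:key}` syntax and the three plugin functions are hypothetical stand-ins; real plugins would call the team's certificate store, secret store or PaaS client libraries.

```python
import re

# Assumed DSL: ${source:key}, where source selects which plugin supplies the value.
PLACEHOLDER = re.compile(r"\$\{(\w+):([^}]+)\}")

def from_config(key):
    # Stand-in for a lookup through the configuration parser library.
    return {"service.port": "8443"}[key]

def from_cert(key):
    # Stand-in for a certificate lookup library returning a thumbprint.
    return f"<thumbprint of {key}>"

def from_secret(key):
    # Stand-in for a secret store client returning an encrypted value.
    return f"<encrypted {key}>"

PLUGINS = {"config": from_config, "cert": from_cert, "secret": from_secret}

def substitute(template_text, plugins):
    """Replace each placeholder by calling the plugin its source prefix names."""
    def replace(match):
        source, key = match.group(1), match.group(2)
        if source not in plugins:
            raise ValueError(f"no plugin registered for source '{source}'")
        return plugins[source](key)
    return PLACEHOLDER.sub(replace, template_text)

print(substitute("port=${config:service.port} cert=${cert:frontend}", PLUGINS))
```

The substitution tool itself never needs to know how a certificate or a secret is fetched; it only dispatches on the source prefix.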